Poster: Moongleam Date: Dec 30, 2011 8:07am
Forum: feature_films Subject: Re: PD?

In most cases the text was correct in the printed book, and the error was introduced by the OCR process.

One cannot expect OCR to be perfect. If you examine printing in a paper book with a powerful magnifying glass, you'll see that the letters aren't perfectly formed. No two e's will be exactly the same. (There could have been too much or too little ink on the lead type; the paper could have imperfections, etc.) Furthermore, the OCR program doesn't know what typeface was used in the book, so it doesn't know the exact shape of the characters.
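To make the point concrete, here is a toy sketch in Python. It is not how any real OCR program works; it simply compares a scanned glyph against stored letter shapes and picks the closest one. The 5x5 bitmaps, the `TEMPLATES` table, and the function names are all made up for illustration. It shows how an "e" whose crossbar received too little ink can end up closer to the stored "c" than to the stored "e".

```python
# Toy nearest-template matcher (illustrative assumption, not a real OCR engine).
# Each glyph is a 5x5 bitmap: '#' is ink, '.' is blank paper.
TEMPLATES = {
    "e": [".###.", "#...#", "#####", "#....", ".###."],
    "c": [".###.", "#...#", "#....", "#...#", ".###."],
}


def distance(a, b):
    """Count the pixels where two equally sized bitmaps differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def classify(glyph):
    """Return the template letter whose bitmap differs least from the glyph."""
    return min(TEMPLATES, key=lambda ch: distance(glyph, TEMPLATES[ch]))


# An 'e' that lost one pixel of its crossbar is still closest to 'e'.
slightly_worn_e = [".###.", "#...#", "##.##", "#....", ".###."]

# An 'e' whose crossbar got too little ink is now closer to 'c'.
badly_inked_e = [".###.", "#...#", "#...#", "#....", ".###."]

print(classify(slightly_worn_e))  # e
print(classify(badly_inked_e))    # c
```

A human reader would use the surrounding word to recover the "e"; the matcher above only sees the pixels.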

It is amazing that OCR works as well as it does (the algorithms used must be very sophisticated), but a literate human is much better at deciphering letters on paper.

This post was modified by Moongleam on 2011-12-30 16:07:31